Abstract. We give a general framework for construction
of small ensembles of capacity achieving linear codes for a 
wide range of (not necessarily memoryless) discrete symmetric 
channels, and in particular, the binary erasure and symmetric 
channels. The main tool used in our constructions is the no- 
tion of randomness extractors and lossless condensers that are 
regarded as central tools in theoretical computer science. As
with random codes, the resulting ensembles preserve their capacity
achieving properties under any change of basis. Our methods can 
potentially lead to polynomial-sized ensembles; however, using 
known explicit constructions of randomness conductors we obtain 
specific ensembles whose size is as small as quasipolynomial in 
the block length. By applying our construction to Justesen's 
concatenation scheme (Justesen, 1972) we obtain explicit capacity 
achieving codes for BEC (resp., BSC) with almost linear time 
encoding and almost linear time (resp., quadratic time) decoding 
and exponentially small error probability. The explicit code for 
BEC is defined and capacity achieving for every block length, a 
property lacking in previously known explicit constructions.

I. Introduction 

One of the basic goals of coding theory is coming up 
with efficient constructions of error-correcting codes that allow 
reliable transmission of information over discrete communi- 
cations channels. Already in the seminal work of Shannon 
[1], the notion of channel capacity was introduced, which
is a characteristic of the communications channel that deter- 
mines the maximum rate at which reliable transmission of 
information (i.e., with vanishing error probability) is possible. 
However, Shannon's result did not focus on the feasibility of 
the underlying code and was mainly concerned with the existence
of reliable, albeit possibly complex, coding schemes. Here 
feasibility can refer to a combination of several criteria, 
including: succinct description of the code and its efficient 
computability, the existence of an efficient encoder and an 
efficient decoder, the error probability, and the set of message 
lengths for which the code is defined. 

Besides heuristic attempts, there is a large body of rigorous 
work in the literature on coding theory with the aim of design- 
ing feasible capacity approaching codes for various discrete 
channels, most notably, the natural and fundamental cases 
of the binary erasure channel (BEC) and binary symmetric 
channel (BSC). Some notable examples in "modern coding" 
include Turbo codes and sparse graph codes (e.g., LDPC codes 
and Fountain codes, cf. [2], [3], [4]). These classes of codes are
either known or strongly believed to contain capacity achieving 
ensembles for the erasure and symmetric channels. While such 
codes are very appealing both theoretically and practically, and
are in particular designed with efficient decoding in mind, in 
this area there still is a considerable gap between what we can 
prove and what is evidenced by practical results, mainly due 
to the complex combinatorial structure of the code constructions.
Moreover, almost all known code constructions in this area 
involve a considerable amount of randomness, which makes 
them prone to a possibility of design failure (e.g., choosing 
an "unfortunate" degree sequence for an LDPC code). While 
the chance of such possibilities is typically small, in general 
there is no known efficient way to certify whether a particular 
outcome of the code construction is satisfactory. Thus, it is 
desirable to come up with constructions of provably capacity 
achieving code families that are explicit, i.e., are efficient and 
do not involve any randomness. 

Explicit construction of capacity achieving codes was considered
as early as the classic work of Forney [5], who showed
that concatenated codes can achieve the capacity of various 
memoryless channels. In this construction, an outer MDS code 
is concatenated with an inner code with small block length that 
can be found in reasonable time by a brute force search. An 
important subsequent work by Justesen [6] (which was originally
aimed at the explicit construction of asymptotically good codes)
shows that it is possible to eliminate the brute force search by 
varying the inner code used for encoding different symbols of 
the outer encoding, provided that the ensemble of inner codes 
contains a large fraction of capacity achieving codes. 

Very recently, Arikan [7] gave a framework for deterministic 
construction of capacity achieving codes for discrete memo- 
ryless channels (DMCs) with binary input that are equipped 
with efficient encoders and decoders and attain slightly worse 
than exponentially small error probability. These codes are 
defined for every block length that is a power of two, which 
might be considered a restrictive requirement. Moreover, the 
construction is currently explicit (in the sense of polynomial- 
time computability of the code description) only for the special 
case of BEC and requires exponential time otherwise. 

In this work, we revisit the concatenation scheme of Juste- 
sen and give new constructions of the underlying ensemble of 
the inner codes. The code ensemble used in Justesen's original 
construction is attributed to Wozencraft. Other ensembles that 
are known to be useful in this scheme include the ensemble of 
Goppa codes and shortened cyclic codes (see [8], Chapter 12).
The number of codes in these ensembles is exponential in 
the block length and they achieve exponentially small error 
probability. These ensembles are also known to achieve the
Gilbert-Varshamov bound, and owe their capacity achieving
properties to the fact that each nonzero vector belongs to a 
small number of the codes in the ensemble. 

As our main result, we give a general framework for 
designing novel capacity achieving ensembles of small size, 
and in doing so we use fundamental tools from theoretical 
computer science that make our techniques radically different 
from the aforementioned results. In particular, we will use 
randomness extractors and lossless condensers (that belong to 
a broader family of objects collectively known as randomness 
conductors) as the main tools in our constructions. The quality 
of the underlying conductor determines the quality of the 
resulting code ensemble. In particular, the size of the code 
ensemble, the decoding error and proximity to the channel 
capacity are determined by the seed length, the error, and the 
output length of the conductor being used. 

We will present our code ensembles in Section II and show
their capacity achieving properties for BEC. In Section III we
will show that the ensemble obtained from lossless condensers 
can also achieve the capacity of BSC. In fact, we show that 
the same ensemble can simultaneously achieve the capacity 
of a much broader range of channels that do not necessarily 
need to be memoryless. Roughly speaking, the same ensem- 
ble is shown to be capacity achieving for any symmetric 
channel with arbitrarily distributed additive noise, and any 
erasure channel with an arbitrarily distributed erasure pattern. 
Moreover, we observe that our code ensembles preserve their 
capacity achieving properties under any change of basis. 

As a concrete example, we instantiate our construction with 
appropriate choices of the underlying conductor and obtain, for 
every block length $n$, a capacity achieving ensemble of size
$2^n$ that attains exponentially small error probability for both
erasure and symmetric channels (as well as the broader range
of channels described above), and an ensemble of quasipolynomial
size $2^{O(\log^3 n)}$ that attains the capacity of BEC. Using
certain optimal conductors that require logarithmic seed
lengths, it is in principle possible to obtain polynomially small 
capacity achieving ensembles for any block length, though to 
this date no explicit construction of such conductors is known. 

Finally, in Section IV we apply our constructions to Justesen's
concatenation scheme to obtain an explicit construction
of capacity-achieving codes for both BEC and BSC that attain 
exponentially small error, as in the original construction of 
Forney. Moreover, the running time of the encoder is almost 
linear in the block length, and decoding takes almost linear 
time for BEC and almost quadratic time for BSC. Using our 
quasipolynomial-sized ensemble as the inner code, we are able 
to construct, for the first time, a fully explicit code for BEC 
that is defined and capacity achieving for every choice of the 
message length. 

A. Preliminaries 

The min-entropy of a distribution $\mathcal{X}$ with finite support $S$ (in
symbols, $S := \mathrm{supp}(\mathcal{X})$) is given by $\min_{x \in S}\{-\log \Pr_{\mathcal{X}}(x)\}$,
where $\Pr_{\mathcal{X}}(x)$ is the probability that $\mathcal{X}$ assigns to $x$. For
flat distributions, i.e., those uniformly distributed on their
support, this quantity coincides with the Shannon entropy. The
statistical distance of two distributions $\mathcal{X}$ and $\mathcal{Y}$ defined on the
same finite space $S$ is given by $\frac{1}{2}\sum_{s \in S} |\Pr_{\mathcal{X}}(s) - \Pr_{\mathcal{Y}}(s)|$,
which is half the $\ell_1$ distance of the two distributions when
regarded as vectors of probabilities over $S$. Two distributions
$\mathcal{X}$ and $\mathcal{Y}$ are said to be $\epsilon$-close if their statistical distance
is at most $\epsilon$. We will use the shorthand $\mathcal{U}_n$ for the uniform
distribution on $\mathbb{F}_2^n$, $\mathcal{U}_{n,p}$ ($0 < p < 1$) for the uniform
distribution on all the vectors in $\mathbb{F}_2^n$ with Hamming weight
at most $pn$, and $X \sim \mathcal{X}$ for a random variable $X$ drawn
from a distribution $\mathcal{X}$. A function $f\colon \mathbb{F}_2^n \times \mathbb{F}_2^d \to \mathbb{F}_2^r$ is a
(strong) $m \to_\epsilon m'$ condenser if for every distribution $\mathcal{X}$ on
$\mathbb{F}_2^n$ with min-entropy at least $m$, random variable $X \sim \mathcal{X}$ and
a seed $Y \sim \mathcal{U}_d$, the distribution of $(Y, f(X,Y))$ is $\epsilon$-close to
a distribution $(\mathcal{U}_d, \mathcal{Z})$ with min-entropy at least $d + m'$. The
parameters $\epsilon$, $m$, $m - m'$, and $r - m'$ are called the error,
the entropy requirement, the entropy loss and the overhead
of the condenser, respectively. A condenser with zero entropy
loss is called lossless, and a condenser with zero overhead
is called a strong $(m, \epsilon)$-extractor. A condenser is explicit if
it is polynomial-time computable and linear if it is a linear
function in its first argument.
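
As a small illustration of the two quantities defined above, the following Python sketch computes the min-entropy (in bits, i.e., with base-2 logarithms, which is an assumption about the base) and the statistical distance of finite distributions. Representing a distribution as a dictionary from outcomes to probabilities, and the function names `min_entropy` and `statistical_distance`, are choices made here for illustration only.

```python
import math

def min_entropy(dist):
    """Min-entropy in bits: -log2 of the largest point probability."""
    return -math.log2(max(dist.values()))

def statistical_distance(p, q):
    """Half the l1 distance between two distributions on a common finite space."""
    space = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in space)

# A flat distribution on 4 of the 8 points of a 3-bit space, and the uniform one.
flat = {x: 0.25 for x in range(4)}
uniform = {x: 0.125 for x in range(8)}

print(min_entropy(flat))                    # 2.0
print(statistical_distance(flat, uniform))  # 0.5
```

For the flat distribution the min-entropy equals the logarithm of the support size, matching the remark that min-entropy and Shannon entropy coincide in that case.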

II. Codes for the Binary Erasure Channel 

Any code with minimum distance $d$ can tolerate up to $d - 1$
erasures in the worst case, i.e., any pattern of up to $d - 1$
erasures can be uniquely recovered regardless of the sent 
codeword. Thus one way to ensure reliable communication 
over BEC(p) is to use binary codes with relative minimum 
distance of about p. However, known negative bounds on the 
rate-distance trade-off (e.g., the MRRW bound, see [9]) do not
allow the rate of such codes to approach the capacity $1 - p$.
However, by imposing the weaker requirement that most of 
the erasure patterns should be recoverable, it is possible to 
attain the capacity with a positive, but small, error probability. 
In this section we consider a different relaxation that preserves 
the worst-case guarantee on the erasure patterns; namely we 
consider ensembles of linear codes with the property that 
any pattern of up to p erasures must be tolerable by all 
but a negligible fraction of the codes in the ensemble. This 
in particular allows us to construct ensembles in which all 
but a negligible fraction of the codes are capacity achieving 
for BEC. Note that as we are only considering linear codes, 
recoverability of an erasure pattern $S \subseteq [n]$, where $n$ is
the block length, i.e., the ability to uniquely reconstruct a 
transmitted codeword x when the set of coordinates of x 
determined by S is erased, is a property of the code and 
independent of x. 

In this section we introduce two constructions, which 
employ strong, linear extractors and lossless condensers as 
their main ingredients. Throughout the section we denote by 
$f\colon \mathbb{F}_2^n \times \mathbb{F}_2^d \to \mathbb{F}_2^r$ a strong, linear, lossless condenser for min-entropy
$m$ and error $\epsilon$, and by $g\colon \mathbb{F}_2^n \times \mathbb{F}_2^{d'} \to \mathbb{F}_2^k$ a strong,
linear extractor for min-entropy $n - m$ and error $\epsilon'$. We assume
that the errors $\epsilon$ and $\epsilon'$ are much smaller than 1. Using this
notation, we define the ensembles as follows:



Ensemble T: We define a code $C_y$ for each seed $y \in \mathbb{F}_2^d$
as follows: Let $H_y$ denote the $r \times n$ matrix that defines the
linear function $f(\cdot, y)$, i.e., for each $x \in \mathbb{F}_2^n$, $H_y \cdot x = f(x, y)$.
Then $H_y$ is a parity check matrix for $C_y$.

Ensemble Q: We define a code $C'_y$ for each seed $y \in \mathbb{F}_2^{d'}$
as follows: Let $G_y$ denote the $k \times n$ matrix that defines the
linear function $g(\cdot, y)$. Then $G_y$ is a generator matrix for $C'_y$.

Obviously, the rate of each code in T is at least $1 - r/n$.
Moreover, as $g$ is a strong extractor we can assume without loss of generality that
the rank of each $G_y$ is exactly $k$. Thus, each code in Q has
rate $k/n$. The following lemma is our main tool in quantifying
the erasure decoding capabilities of the two ensembles:

Lemma 1: Let $S \subseteq [n]$ be a set of size at most $m$. Then all
but a $\sqrt{\epsilon}$ fraction of the codes in T and all but a $\sqrt{\epsilon'}$ fraction
of those in Q can tolerate the erasure pattern defined by $S$.

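Proof: We prove the result for the ensemble Q. The
argument for T is similar. Consider a probability distribution
$\mathcal{S}$ on $\mathbb{F}_2^n$ that is uniform on the coordinates specified by
$\bar{S} := [n] \setminus S$ and fixed to zeros elsewhere. Thus the min-entropy
of $\mathcal{S}$ is at least $n - m$, and the distribution $(Y, g(\mathcal{S}, Y))$, where
$Y \sim \mathcal{U}_{d'}$, is $\epsilon'$-close to $\mathcal{U}_{d'+k}$. By an averaging argument, for
all but a $\sqrt{\epsilon'}$ fraction of the choices of $y \in \mathbb{F}_2^{d'}$, the distribution
of $g(\mathcal{S}, y)$ is $\sqrt{\epsilon'}$-close to $\mathcal{U}_k$. Fix such a $y$. In fact, since the
support of $\mathcal{S}$ is a linear subspace of $\mathbb{F}_2^n$ and the function $g(\cdot, y)$
is linear, the distribution of $g(\mathcal{S}, y)$ must be exactly uniform:
it is uniform on a linear subspace of $\mathbb{F}_2^k$, and the uniform
distribution on a proper subspace is at statistical distance at
least $1/2$ from $\mathcal{U}_k$.
Thus, the $k \times |\bar{S}|$ submatrix of $G_y$ consisting of the columns
picked by $\bar{S}$ must have rank $k$, which implies that for every
$x \in \mathbb{F}_2^k$, the projection of the encoding $x \cdot G_y$ to the coordinates
chosen by $\bar{S}$ uniquely identifies $x$. ■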
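
The rank criterion at the end of the proof can be checked mechanically for a concrete generator matrix. The following Python sketch is only an illustration under assumed conventions: each row of the generator matrix is stored as an integer bitmask with bit $j$ standing for coordinate $j$, the rows are assumed linearly independent, and the helper names `gf2_rank` and `tolerates_erasures` are hypothetical. The toy $[3,2]$ single parity check code is our own example, not one taken from the ensembles above.

```python
def gf2_rank(rows):
    """Rank over F_2 of a matrix whose rows are given as integer bitmasks."""
    basis = []
    for row in rows:
        # Reduce the row against the basis; min() clears the leading bit of b if set.
        for b in basis:
            row = min(row, row ^ b)
        if row:
            basis.append(row)
    return len(basis)

def tolerates_erasures(gen_rows, n, erased):
    """True if the surviving columns of the generator matrix still have full rank,
    i.e., if the unerased coordinates uniquely identify the message."""
    keep_mask = sum(1 << j for j in range(n) if j not in erased)
    punctured = [row & keep_mask for row in gen_rows]
    return gf2_rank(punctured) == len(gen_rows)

# Toy [3,2] single parity check code: codewords are the even-weight vectors.
G = [0b101, 0b110]

print(tolerates_erasures(G, 3, {0}))     # True: one erasure is recoverable
print(tolerates_erasures(G, 3, {0, 1}))  # False: two erasures are not
```

Applying the same check to every seed $y$ of an ensemble and counting the failures gives the fraction of codes that cannot tolerate a given pattern, which is the quantity bounded by Lemma 1.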

The lemma combined with a double averaging argument 
implies the following corollary: 

Corollary 2: Let $\mathcal{S}$ be any distribution on the subsets of $[n]$
of size at most $m$. Then all but an $\epsilon^{1/4}$ (resp., $\epsilon'^{1/4}$) fraction of
the codes in T (resp., Q) can tolerate erasure patterns sampled
from $\mathcal{S}$ with probability at least $1 - \epsilon^{1/4}$ (resp., $1 - \epsilon'^{1/4}$).

Note that the result holds irrespective of the distribution $\mathcal{S}$,
contrary to the familiar case of BEC($p$) for which the erasure
pattern has an iid distribution. For the case of BEC($p$), the
erasure pattern (regarded as its binary characteristic vector
in $\mathbb{F}_2^n$) is given by $S := (S_1, \ldots, S_n)$, where the random
variables $S_1, \ldots, S_n \in \mathbb{F}_2$ are iid and $\Pr[S_i = 1] = p$. We
denote this particular distribution by $\mathcal{B}_{n,p}$, which assigns a
nonzero probability to every vector in $\mathbb{F}_2^n$. Thus in this case
we cannot directly apply Corollary 2. However, note that $\mathcal{B}_{n,p}$
can be written as a convex combination

$$\mathcal{B}_{n,p} = (1 - \gamma)\,\mathcal{U}_{n,p'} + \gamma\,\mathcal{D}, \qquad (1)$$

for $p' := p + \Omega(1)$ that is arbitrarily close to $p$, where $\mathcal{D}$ is
an "error distribution" whose contribution $\gamma$ is exponentially
small. The distribution $\mathcal{U}_{n,p'}$ is only supported on vectors of
weight at most $np'$, for which the above result applies by
setting $m = np'$. Moreover, by the convex combination above,
the erasure decoding error probabilities of any code for the erasure
pattern distributions $\mathcal{B}_{n,p}$ and $\mathcal{U}_{n,p'}$ differ by no more than $\gamma$.
Therefore, the above result applied to the erasure distribution
$\mathcal{U}_{n,p'}$ handles the particular case of BEC($p$) with essentially
no change in the error probability.

In light of Corollary 2, in order to obtain rates arbitrarily
close to the channel capacity, the output lengths of $f$ and $g$
must be sufficiently close to the entropy requirement $m$. More
precisely, it suffices to have $r \le (1 + \alpha)m$ and $k \ge (1 - \alpha)m$
for an arbitrarily small constant $\alpha > 0$. The seed lengths of $f$
and $g$ determine the size of the code ensemble. Moreover, the
errors of the extractor and condenser determine the erasure error
probability of the resulting code ensemble. As achieving the
channel capacity is the most important concern for us, we will
need to instantiate $f$ (resp., $g$) with a linear, strong, lossless
condenser (resp., extractor) whose output length is close to $m$.
We mention one such instantiation for each function.

For both functions $f$ and $g$, we can use the following
straightforward generalization of the well-known leftover hash
lemma [10]:

Lemma 3: Let $\varphi\colon \mathbb{F}_2^n \to \mathbb{F}_{2^n}$ be an arbitrary isomorphism
between the vector space $\mathbb{F}_2^n$ and the extension field $\mathbb{F}_{2^n}$.
Define $f\colon \mathbb{F}_2^n \times \mathbb{F}_2^n \to \mathbb{F}_2^r$, where $r := m + 2\log(1/\epsilon)$, and
$g\colon \mathbb{F}_2^n \times \mathbb{F}_2^n \to \mathbb{F}_2^k$, where $k := m - 2\log(1/\epsilon)$, as follows:
For $x, y \in \mathbb{F}_2^n$, let $z(x, y) := \varphi^{-1}(\varphi(x) \cdot \varphi(y))$. Then $f(x, y)$
(resp., $g(x, y)$) is the projection of $z(x, y)$ onto its first $r$ (resp.,
$k$) coordinates. The function $f$ (resp., $g$) is a linear, strong,
lossless condenser (resp., extractor) for all min-entropies of
up to $m$ (resp., at least $m$).
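
To make the map $z(x, y)$ concrete, here is a minimal Python sketch for the toy case $n = 8$. It rests on several assumptions that are ours and not part of the lemma: field elements are integer bitmasks in the polynomial basis (so $\varphi$ is the identity on bit strings), the field is defined by the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$, and "the first coordinates" are taken to be the low-order bits. The names `gf_mul` and `truncated_product` are hypothetical.

```python
N = 8
MODULUS = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over F_2 (assumed choice)

def gf_mul(a, b):
    """Multiply two elements of F_{2^8}: carry-less product reduced modulo MODULUS."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:          # degree reached 8: subtract (XOR) the modulus
            a ^= MODULUS
    return result

def truncated_product(x, y, out_len):
    """Projection of z(x, y) = x * y onto out_len coordinates (low-order bits here).
    With out_len = r this plays the role of f, with out_len = k the role of g."""
    return gf_mul(x, y) & ((1 << out_len) - 1)

# For a fixed seed y the map x -> truncated_product(x, y, r) is linear over F_2.
y, r = 0x1D, 5
a, b = 0x57, 0x83
print(truncated_product(a ^ b, y, r) == truncated_product(a, y, r) ^ truncated_product(b, y, r))  # True
```

Because the map is linear in $x$ for every fixed seed, evaluating it on the standard basis vectors yields the columns of the matrices $H_y$ and $G_y$ used to define the two ensembles.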

The above construction is optimal in the output length,
but requires a large seed, namely, $d = n$. Thus the resulting
ensemble will have size $2^n$, but attains a positive
error exponent $\delta/2$ for an arbitrary rate loss $\delta > 0$. Using
an optimal lossless condenser or extractor with seed length
$d = \log(n) + O(\log(1/\epsilon))$ and output length close to $m$, it
is possible to obtain a polynomially small capacity-achieving
ensemble; however, to this date no explicit construction of a
linear condenser or extractor with this property is known. In
the world of linear extractors, an important construction due
to Trevisan [11] gets rather close to the optimal seed length.
Here we mention an improvement of this result due to Raz
et al. [12]:

Theorem 4: [12] For all positive integers $n$, $m$ and $\epsilon > 0$,
there is an explicit strong linear seeded $(m, \epsilon)$-extractor
$g\colon \mathbb{F}_2^n \times \mathbb{F}_2^d \to \mathbb{F}_2^k$ with $d = O(\log^3(n/\epsilon))$ and $k = m - O(d)$.

As a concrete result, we combine this theorem with Corollary 2
and the discussion above to obtain the following:

Corollary 5: Let $p, c > 0$ be arbitrary constants. Then for
every integer $n > 0$, there is a constructible ensemble Q of
linear codes of rate $1 - p - o(1)$ such that the size of Q
is quasipolynomial, i.e., $|Q| = 2^{O(c^3 \log^3 n)}$, and all but an
$n^{-c} = o(1)$ fraction of the codes in the ensemble have error
probability at most $n^{-c}$ when used over BEC($p$).

One can also use Lemma 3 instead and obtain a larger
ensemble of size $2^n$ but with an exponentially small error
probability and exponentially small fraction of "bad" codes.
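
To get a feeling for how the three seed lengths discussed in this section compare, the following Python sketch evaluates them for a concrete block length. It is purely illustrative: the constants hidden in the $O(\cdot)$ notation are unknown here and are replaced by a single assumed parameter `const` (default 1), logarithms are taken to base 2, and the function name `seed_lengths` is hypothetical. The seed length equals the logarithm of the ensemble size.

```python
import math

def seed_lengths(n, c, const=1.0):
    """Seed lengths (log2 of the ensemble size) for error eps = n^(-c).
    `const` stands in for the unknown constants hidden in the O-notation."""
    eps = n ** (-c)
    return {
        "leftover_hash": n,                                       # d = n (Lemma 3)
        "raz_et_al": const * math.log2(n / eps) ** 3,             # d = O(log^3(n/eps))
        "optimal": math.log2(n) + const * math.log2(1 / eps),     # d = log n + O(log(1/eps))
    }

for name, d in seed_lengths(n=2**20, c=1).items():
    print(f"{name}: about {d:.0f} seed bits")
```

Under these assumed constants the quasipolynomial ensemble is far smaller than the one of size $2^n$ for large $n$, while the (non-explicit) optimal conductors would give a polynomial-size ensemble.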



III. Codes for the Binary Symmetric Channel 

The goal of this section is to design capacity achieving code 
ensembles for the binary symmetric channel BSC(p). As it 
turns out, our result applies to more sophisticated symmetric 
channels that do not need to be memoryless. Thus, we first 
introduce a generalization of the standard notion of BSC 
to binary and symmetric channels with arbitrarily distributed 
additive binary noise. 

For an integer $n > 0$, let $\mathcal{Z}$ be a probability distribution on
$\mathbb{F}_2^n$. Consider a DMSC $\mathcal{C}$ with input and output alphabet $\mathbb{F}_2^n$
that maps $X \in \mathbb{F}_2^n$ to $X + Z$, where $Z \sim \mathcal{Z}$ is an independent
channel noise. Then each use of $\mathcal{C}$ can be naturally regarded
as $n$ binary channel uses, resulting in a binary channel that
we denote by BSC($\mathcal{Z}$). The special case BSC($p$) is obtained
by setting $\mathcal{Z} = \mathcal{B}_{n,p}$. It is easy to see that the capacity of
BSC($\mathcal{Z}$) is at most $1 - h(\mathcal{Z})$, where $h(\mathcal{Z})$ is the entropy rate
of $\mathcal{Z}$.

In this section we extend our framework for construction of 
small ensembles of linear codes that combinatorially achieve 
this capacity (i.e., under ML decoding) when $\mathcal{Z}$ is a flat distribution,
and in particular use it to obtain capacity achieving
codes for BSC(p). The code ensemble that we use for the 
symmetric channel is the ensemble T that we introduced in 
the preceding section. Thus, we adopt the notation that we 
used before for defining the ensemble T . Recall that each 
code in the ensemble has rate at least $1 - r/n$. Moreover, for
each $y \in \mathbb{F}_2^d$, denote by $\mathcal{E}(C_y, \mathcal{Z})$ the error probability of the
ML decoder for code $C_y$ over BSC($\mathcal{Z}$). The following lemma
quantifies this probability:

Lemma 6: Let $\mathcal{Z}$ be a flat distribution with entropy $m$. Then
for at least a $1 - 2\sqrt{\epsilon}$ fraction of the choices of $y \in \mathbb{F}_2^d$, we
have $\mathcal{E}(C_y, \mathcal{Z}) \le \sqrt{\epsilon}$.

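Proof: Let $X \sim \mathcal{Z}$, $Y \sim \mathcal{U}_d$, and define $F := f(X, Y)$.
Thus, the distribution of $(Y, F)$ is $\epsilon$-close to $(\mathcal{U}_d, \mathcal{D})$, where
$\mathcal{D}$ has min-entropy at least $m$. For $y \in \mathbb{F}_2^d$ and $s \in \mathbb{F}_2^r$, define
$N(y, s) := |\{x \in \mathrm{supp}(\mathcal{Z}) : f(x, y) = s\}|$. Thus, $\Pr[F = s \mid Y = y] = N(y, s)/2^m$.
Using this notation, we can write

$$\sum_{y \in \mathbb{F}_2^d} \sum_{s \in \mathrm{supp}(F)} \left| N(y, s) - 2^m \Pr_{\mathcal{D}}(s) \right| \le \epsilon\, 2^{m+d+1},$$

by the fact that $(Y, F) \sim_\epsilon (\mathcal{U}_d, \mathcal{D})$. As the min-entropy of $\mathcal{D}$
is at least $m$, $\Pr_{\mathcal{D}}(s)$ is always no more than $2^{-m}$, thus the
quantity inside the absolute value is always non-negative and
we have

$$2^{-d} \sum_{y \in \mathbb{F}_2^d} \sum_{s \in \mathrm{supp}(F)} (N(y, s) - 1) \le \epsilon\, 2^{m+1}. \qquad (2)$$

We call a seed $y \in \mathbb{F}_2^d$ good if $\Pr[N(y, F) > 1 \mid Y = y] \le \sqrt{\epsilon}$.
Thus, the ML decoder will have error probability at most $\sqrt{\epsilon}$
for all choices of $y$ that are good, in which case the probability
of "syndrome confusion" is small. An averaging argument on
(2) reveals that for all but a $2\sqrt{\epsilon}$ fraction of the choices of
$y$, we have $(1 - \sqrt{\epsilon})2^m \le |\mathrm{supp}(F)| \le 2^m$, which easily
implies that $|\{x \in \mathrm{supp}(\mathcal{Z}) : N(y, f(x, y)) > 1\}| \le \sqrt{\epsilon}\, 2^m$.
Hence, any such $y$ is good and defines a code $C_y$ for which
$\mathcal{E}(C_y, \mathcal{Z}) \le \sqrt{\epsilon}$. ■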
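
The notion of "syndrome confusion" used in the proof can be illustrated numerically. The Python sketch below is an illustration under our own conventions, not the construction of the lemma: the parity check matrix is given as a list of integer row bitmasks, the flat noise distribution is given by listing its support, and the names `syndrome` and `ml_error_flat` are hypothetical. For flat noise, an ML (syndrome) decoder can be correct for only one noise vector per syndrome, so its success probability is the number of distinct syndromes divided by the support size.

```python
from collections import Counter

def syndrome(H_rows, x):
    """H * x over F_2; H_rows and x are integer bitmasks, one output bit per row."""
    s = 0
    for i, row in enumerate(H_rows):
        s |= (bin(row & x).count("1") & 1) << i
    return s

def ml_error_flat(H_rows, noise_support):
    """Error probability of an ML decoder when the noise is uniform on noise_support."""
    counts = Counter(syndrome(H_rows, z) for z in noise_support)
    return 1 - len(counts) / len(noise_support)

# Parity check matrix of the [7,4] Hamming code (column j is the binary expansion of j+1).
H = [0b1010101, 0b1100110, 0b1111000]

# Flat noise on the zero vector and the seven weight-one vectors (entropy 3 bits).
ball = [0] + [1 << j for j in range(7)]
print(ml_error_flat(H, ball))   # 0.0: all eight syndromes are distinct

# Adding one weight-two vector creates a syndrome collision.
err = ml_error_flat(H, ball + [0b11])
print(err)                      # 1/9, one of nine noise vectors is decoded wrongly
```

In the notation of the proof, the number of distinct syndromes is $|\mathrm{supp}(F)|$ for the fixed seed, and the error is small exactly when this number is close to $2^m$.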

The lemma implies that any lossless condenser with entropy 
requirement m can be used to construct an ensemble of codes 
such that all but a small fraction of the codes are good for 
reliable transmission over BSC($\mathcal{Z}$), where $\mathcal{Z}$ is an arbitrary
flat distribution with entropy at most $m$. Similar to the case of
BEC, the seed length determines the size of the ensemble, 
the error of the condenser bounds the error probability of 
the decoder, and the output length determines the proximity 
of the rate to the capacity of the channel. Again, using the 
condenser given by Lemma 3, we can obtain a capacity
achieving ensemble of size $2^n$.

It is not hard to see that the converse of the above result 
is also true; namely, that any ensemble of linear codes that 
is universally capacity achieving with respect to any noise 
distribution with a particular entropy defines a strong linear, 
lossless, condenser. Thus the known lower bounds on the 
seed length and the output length of lossless condensers [13]
translate into lower bounds on the size of the code ensemble 
and proximity to the capacity. In particular, in order to get 
a positive error exponent, the size of the ensemble must be 
exponentially large. 

We also point out that the code ensembles T and Q dis- 
cussed in this and the preceding section preserve their erasure 
and error correcting properties under any change of basis in 
the ambient space $\mathbb{F}_2^n$, due to the fact that a change of basis
applied on any linear condenser results in a linear condenser
with the same parameters. This is a property achieved by the
trivial, but large, ensemble of codes defined by the set of all 
r x n parity check matrices. Note that no single code can be 
universal in this sense, and it is inevitable to have a sufficiently 
large ensemble to attain this property. 

For the special case of BSC($p$), the noise distribution $\mathcal{B}_{n,p}$ is
not a flat distribution. However, we can again use the convex
combination (1) and note that the distribution $\mathcal{U}_{n,p'}$ is flat,
with entropy $m = n(h(p') + o(1))$. Thus, the capacities of
BSC($\mathcal{U}_{n,p'}$) and BSC($p$) are the same, up to a negligible term.
Moreover, by the convex combination, the error probability of
any code over these two channels differs by no more than $\gamma$.
Therefore, in order to achieve the capacity of BSC($p$) it is
sufficient to achieve the capacity of BSC($\mathcal{U}_{n,p'}$) instead, which
is covered by our result above.

IV. Explicit Capacity Achieving Codes 

In this section we apply our construction of capacity 
achieving ensembles to Justesen's concatenation scheme and 
obtain an explicit construction of capacity achieving codes. 
Our analysis closely follows the original analysis of Forney, 
with only minor modifications. 

Suppose that $\mathcal{S}$ is an ensemble of linear codes with block
length $n$ and rate $R$, for which it is guaranteed that all but
an $o(1)$ fraction of the codes are capacity achieving (for
a particular DMSC, in our case either BEC($p$) or BSC($p$))
with some vanishing (i.e., $o(1)$) error probability. As an outer
code we use an expander-based construction of asymptotically
good codes due to Spielman [14], which is summarized in the
following theorem:

Theorem 7: For every integer $k > 0$ and every absolute
constant $R' < 1$, there is an explicit family of $\mathbb{F}_2$-linear codes
over $\mathbb{F}_{2^k}$ for every block length and rate $R'$ that is error-correcting
for an $\Omega(1)$ fraction of errors. The running time of
the encoder and the decoder is linear in the bit-length of the
codewords.

For every $n > 0$, we use the above code with block length
$|\mathcal{S}|$, alphabet size determined by setting $k = \lfloor Rn \rfloor$, and rate $R'$
close to 1, as the outer code and encode the $i$th symbol of each
codeword using the $i$th code in the ensemble $\mathcal{S}$. The resulting
binary concatenated code $C$ will have rate $RR'$, which can be
made arbitrarily close to $R$ by choosing $R'$ appropriately,
and block length $N = n|\mathcal{S}|$. Moreover, the running time
of the encoder is $O(nN)$ by Theorem 7. For decoding, we
use a naive decoder that applies an ML decoder to the inner
codes followed by the outer decoder¹ given by Theorem 7.
The running time of a trivial ML decoder for the inner code
is polynomial in $n$ for erasure decoding, but exponential in the
presence of errors. Thus, the decoder for the concatenated code
will have running time $O(n^2 N)$ over BEC and $O(n^2 2^{Rn} N)$
over BSC. As a concrete example, when $|\mathcal{S}| = 2^n$, we have
$n = O(\log N)$ and get a decoding running time that is almost
linear in $N$ for BEC and almost quadratic in $N$ for
BSC.
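
The parameters of the concatenated code follow from simple arithmetic, which the Python sketch below spells out. It is a bookkeeping aid only: the function name `concatenated_parameters` and its arguments are hypothetical, the ensemble size is assumed to be a power of two, and the operation counts are the bounds $n^2 N$ and $n^2 2^{Rn} N$ from the text with all constants dropped.

```python
import math

def concatenated_parameters(n, inner_rate, outer_rate, log2_ensemble_size):
    """Parameters of the concatenation of an outer code over k-bit symbols
    with an ensemble of 2^log2_ensemble_size inner codes of block length n."""
    ensemble_size = 2 ** log2_ensemble_size   # |S|, also the outer block length
    k = math.floor(inner_rate * n)            # bits per outer symbol
    N = n * ensemble_size                     # binary block length N = n |S|
    return {
        "outer_symbol_bits": k,
        "block_length": N,
        "rate": inner_rate * outer_rate,      # R * R'
        "bec_decoding_ops": n**2 * N,         # O(n^2 N), constants dropped
        "bsc_decoding_ops": n**2 * 2**k * N,  # O(n^2 2^{Rn} N), constants dropped
    }

# Inner ensemble of size 2^n with n = 16, inner rate 1/2, outer rate 0.9.
params = concatenated_parameters(n=16, inner_rate=0.5, outer_rate=0.9, log2_ensemble_size=16)
for name, value in params.items():
    print(name, value)
```

With $|\mathcal{S}| = 2^n$ the factor $n^2$ is only polylogarithmic in $N$, while the factor $2^{Rn}$ in the BSC bound is at most $2^n \le N$, which is where the almost quadratic decoding time comes from.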

The error probability of the concatenated code can be
analyzed in the same way as in [5]; here we reproduce
a crude analysis for completeness: Consider communication
over BSC($p$) (the analysis for BEC($p$) is similar), and assume
that the inner ensemble is capacity achieving for this channel
with rate $R$ that is slightly below the capacity, and that $R'$ is
chosen sufficiently close to 1 so that $RR' = 1 - h(p) - \delta$, where
$\delta > 0$ is an arbitrarily small constant. As all but a vanishing
fraction of the codes in $\mathcal{S}$ attain a vanishing error probability
over BSC($p$), we expect that $|\mathcal{S}|(1 - o(1))$ of the inner decoders
succeed in recovering the correct transmitted sequence, and
thus, that the sequence delivered to the outer decoder is
corrupted in an $o(1)$ fraction of the coordinates, a situation that
can be handled by the outer decoder. By observing the fact
that the channel is memoryless, it is easy to see (e.g., using
Chernoff bounds) that the error probability (i.e., the probability
that the outer decoder observes more corruptions than it can
handle) is upper bounded by $2^{-\Omega(N)}$.

A drawback of the concatenation scheme of Justesen that
we used above is that the code $C$ is defined for block lengths
of the form $n\,s(n)$, where $s(n)$ is the size of the code ensemble
$\mathcal{S}$ for the inner block length $n$, and thus the density of the set
of the block lengths (and thus message lengths) for which
the code is defined depends on the growth rate of $s(n)$.²
Obviously, one can extend the code construction in a trivial
way to make it defined for every block length $N$ by taking the
largest block length $n\,s(n)$ in the original construction that is
below the given $N$ and padding the encoding by a sufficient
amount of redundant symbols. However, this will decrease the
rate of the original code by a multiplicative factor of about
$s(n+1)/s(n)$. Thus, the code will lose its capacity achieving
properties if this fraction is away from 1 by more than a
constant, i.e., when the ensemble $\mathcal{S}$ is exponentially large.
Therefore, in order to obtain an explicit capacity achieving
code that is defined for every block length, it suffices to
have a subexponential-sized capacity achieving ensemble. As
mentioned in the introduction, all previously known ensembles
applicable in concatenation schemes are exponentially large,
whereas our framework is potentially capable of producing
polynomial-sized ensembles. Indeed, the quasipolynomial-sized
ensemble of Corollary 5 leads to an explicit erasure code
that is capacity achieving for every length with almost linear
time (i.e., $O(N^{1+o(1)})$) encoding and decoding.

Though we analyzed the concatenation scheme for the
special case of BSC($p$), we remark that the application of
our conductor-based ensembles to Justesen's scheme leads
to explicit codes that simultaneously achieve the capacity of
a more general class of channels, due to the universality
properties of the ensembles discussed in the preceding sections. As
an example, we mention a generalization of the BSC($p$) where
the individual bit flip probability of each channel use might
vary but its average binary entropy is known, or similarly, a
variation of BEC($p$) with varying bit erasure probabilities.

1 Alternatively, one can use a Reed-Solomon code as the outer code
combined with GMD decoding, as in the original work of Forney, and obtain
a slightly better error exponent. However, using Spielman's code results in a
more efficient encoder and decoder for the concatenated code, which we find
more favorable.

2 We remark that the recent construction of Arikan [7] also suffers from a
similar restriction.